Overparameterization: A Connection Between Software 1.0 and Software 2.0
A new ecosystem of machine-learning-driven applications, termed Software 2.0, has arisen that integrates neural networks into a variety of computational tasks. Such applications include image recognition, natural language processing, and other traditional machine learning tasks. However, these techniques have also grown to include other structured domains, such as program analysis and program optimization, in which novel, domain-specific insights are combined with model design. In this paper, we connect the world of Software 2.0 with that of traditional software - Software 1.0 - through overparameterization: a program may provide more computational capacity and precision than is necessary for the task at hand.
In Software 2.0, overparameterization - when a machine learning model has more parameters than datapoints in the dataset - has emerged as a contemporary explanation for the ability of modern, gradient-based learning methods to learn high-accuracy models over complex datasets. Specifically, the more parameters a model has, the better it learns.
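To illustrate (a toy sketch with hypothetical data, not an experiment from the paper), a least-squares model with more coefficients than training points can interpolate the data exactly:

```python
import numpy as np

# Hypothetical toy data: 5 training points, fit with 10 polynomial coefficients,
# i.e. more parameters than datapoints.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 5)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(5)

# Vandermonde design matrix with 10 columns; lstsq returns the minimum-norm solution.
A = np.vander(x, N=10, increasing=True)
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)

# The overparameterized model interpolates the training data essentially exactly.
print("max training error:", np.max(np.abs(A @ coeffs - y)))
```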
In Software 1.0, the results of the approximate computing community show that traditional software is also overparameterized, in that software often computes results that are more precise than the user requires. Approximate computing exploits this overparameterization to improve performance by eliminating unnecessary, excess computation. For example, one technique among many is to reduce the precision of arithmetic in the application.
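As a concrete illustration of this kind of excess precision (a minimal Python sketch, not code from the paper), the same dot product computed in 16-bit rather than 64-bit floating point typically agrees to several significant digits, which may already exceed what the user needs:

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal(1000)
b = rng.standard_normal(1000)

# The same dot product in full (64-bit) and reduced (16-bit) precision.
exact = np.dot(a, b)
approx = np.dot(a.astype(np.float16), b.astype(np.float16))

# If the user only needs a couple of significant digits, the cheap version suffices.
print("float64:", exact)
print("float16:", float(approx))
```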
In this paper, we argue that the gap between the precision that is available and the precision that is required, for either Software 1.0 or Software 2.0, is a fundamental aspect of software design that illustrates the balance between general-purpose software and domain-adapted solutions. A general-purpose solution is easier to develop and maintain than a domain-adapted solution. However, that ease comes at the expense of performance.
We show that the approximate computing community and the machine learning community have developed overlapping techniques to improve performance by reducing overparameterization. We also show that, because of these shared techniques, questions, concerns, and answers on how to construct software can translate from one software variant to the other.
Ithemal: Accurate, Portable and Fast Basic Block Throughput Estimation using Deep Neural Networks
Predicting the number of clock cycles a processor takes to execute a block of
assembly instructions in steady state (the throughput) is important for both
compiler designers and performance engineers. Building an analytical model to
do so is especially complicated in modern x86-64 Complex Instruction Set
Computer (CISC) machines with sophisticated processor microarchitectures in
that it is tedious, error-prone, and must be performed from scratch for each
processor generation. In this paper we present Ithemal, the first tool which
learns to predict the throughput of a set of instructions. Ithemal uses a
hierarchical LSTM-based approach to predict throughput based on the opcodes
and operands of instructions in a basic block. We show that Ithemal is more
accurate than state-of-the-art hand-written tools currently used in compiler
backends and static machine code analyzers. In particular, our model has less
than half the error of state-of-the-art analytical models (LLVM's llvm-mca and
Intel's IACA). Ithemal is also able to predict these throughput values just as
fast as the aforementioned tools, and is easily ported across a variety of
processor microarchitectures with minimal developer effort.
Comment: Published at the 36th International Conference on Machine Learning (ICML), 2019.
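The hierarchical structure described in the abstract can be sketched roughly as follows (a simplified PyTorch sketch with hypothetical module names and vocabulary size, not Ithemal's released implementation): one LSTM summarizes the tokens of each instruction, and a second LSTM consumes those per-instruction summaries to produce a throughput prediction for the block.

```python
import torch
import torch.nn as nn

class HierarchicalThroughputModel(nn.Module):
    """Sketch: a token-level LSTM per instruction, then a block-level LSTM."""

    def __init__(self, vocab_size=1024, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.token_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.instr_lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.predict = nn.Linear(hidden_dim, 1)

    def forward(self, block):
        # block: a list of 1-D LongTensors, one per instruction
        # (opcode and operand token ids).
        instr_embeddings = []
        for tokens in block:
            emb = self.embed(tokens).unsqueeze(0)      # (1, n_tokens, embed_dim)
            _, (h, _) = self.token_lstm(emb)           # h: (1, 1, hidden_dim)
            instr_embeddings.append(h.squeeze(0))      # (1, hidden_dim)
        seq = torch.stack(instr_embeddings, dim=1)     # (1, n_instr, hidden_dim)
        _, (h, _) = self.instr_lstm(seq)
        return self.predict(h.squeeze(0)).squeeze(-1)  # predicted throughput

model = HierarchicalThroughputModel()
block = [torch.tensor([3, 17, 42]), torch.tensor([5, 9])]  # hypothetical token ids
print(model(block))
```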
Tower: Data Structures in Quantum Superposition
Emerging quantum algorithms for problems such as element distinctness, subset
sum, and closest pair demonstrate computational advantages by relying on
abstract data structures. Practically realizing such an algorithm as a program
for a quantum computer requires an efficient implementation of the data
structure whose operations correspond to unitary operators that manipulate
quantum superpositions of data.
To correctly operate in superposition, an implementation must satisfy three
properties -- reversibility, history independence, and bounded-time execution.
Standard implementations, such as the representation of an abstract set as a
hash table, fail these properties, calling for tools to develop specialized
implementations.
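To make the history-independence requirement concrete (an illustrative classical Python sketch, not Tower code): a set stored as an insertion-ordered list exposes the order in which elements were added, whereas a canonical representation such as a sorted list yields the same bits for every sequence of insertions that builds the same abstract set.

```python
def set_as_insertion_list(items):
    """History-dependent: the representation records insertion order."""
    rep = []
    for x in items:
        if x not in rep:
            rep.append(x)
    return rep

def set_as_sorted_list(items):
    """History-independent: any insertion order yields the same representation."""
    return sorted(set(items))

# The same abstract set {1, 2, 3} built along two different histories.
print(set_as_insertion_list([1, 2, 3]), set_as_insertion_list([3, 2, 1]))  # differ
print(set_as_sorted_list([1, 2, 3]), set_as_sorted_list([3, 2, 1]))        # identical
```

In superposition, a history-dependent representation would tie the stored set to the sequence of operations that built it, which is why history independence is required for correct operation.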
In this work, we present Core Tower, the first language for quantum
programming with random-access memory. Core Tower enables the developer to
implement data structures as pointer-based, linked data. It features a
reversible semantics enabling every valid program to be translated to a unitary
quantum circuit.
We present Boson, the first memory allocator that supports reversible,
history-independent, and constant-time dynamic memory allocation in quantum
superposition. We also present Tower, a language for quantum programming with
recursively defined data structures. Tower features a type system that bounds
all recursion using classical parameters as is necessary for a program to
execute on a quantum computer.
Using Tower, we implement Ground, the first quantum library of data
structures, including lists, stacks, queues, strings, and sets. We provide the
first executable implementation of sets that satisfies all three mandated
properties of reversibility, history independence, and bounded-time execution.
Comment: 30 pages, 22 figures. [v2] Add discussion of concurrent work in Sec. 1.4 and add acknowledgements section. [v3] Camera-ready version; incorporates revisions following conference review.
Typesafety for Explicitly-Coded Probabilistic Inference Procedures
Researchers have recently proposed several systems that ease the process of developing Bayesian probabilistic inference algorithms. These include systems for automatic inference algorithm synthesis as well as stronger abstractions for manual algorithm development. However, existing systems whose performance relies on the developer manually constructing a part of the inference algorithm have limited support for reasoning about the correctness of the resulting algorithm. In this paper, we present Shuffle, a programming language for developing manual inference algorithms that enforces 1) the basic rules of probability theory and 2) statistical dependencies of the algorithm's corresponding probabilistic model. We have used Shuffle to develop inference algorithms for several standard probabilistic models. Our results demonstrate that Shuffle enables a developer to deliver performant implementations of these algorithms with the added benefit of Shuffle's correctness guarantees.
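As a rough illustration of the kind of invariant such a system can enforce (a hypothetical numeric sketch, not Shuffle's actual type system, which reasons statically): a developer-supplied conditional distribution should be consistent with the model's joint distribution via the chain rule P(a, b) = P(a | b) * P(b).

```python
# Hypothetical discrete model: joint distribution over (a, b), with a, b in {0, 1}.
joint = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}

def marginal_b(b):
    return sum(p for (_, b2), p in joint.items() if b2 == b)

# A developer-supplied conditional P(a | b), as might appear inside a handwritten
# inference step; the check verifies it is consistent with the joint model.
proposed_conditional = {(0, 0): 0.25, (1, 0): 0.75, (0, 1): 1 / 3, (1, 1): 2 / 3}

def check_chain_rule():
    # P(a, b) must equal P(a | b) * P(b) for every assignment.
    return all(abs(joint[(a, b)] - proposed_conditional[(a, b)] * marginal_b(b)) < 1e-9
               for (a, b) in joint)

print("consistent with the model:", check_chain_rule())
```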
Programming and Reasoning with Partial Observability
Computer programs are increasingly being deployed in partially-observable
environments. A partially observable environment is an environment whose state
is not completely visible to the program, but from which the program receives
partial observations. Developers typically deal with partial observability by
writing a state estimator that, given observations, attempts to deduce the
hidden state of the environment. In safety-critical domains, to formally verify
safety properties developers may write an environment model. The model captures
the relationship between observations and hidden states and is used to prove
the software correct.
In this paper, we present a new methodology for writing and verifying
programs in partially observable environments. We present belief programming, a
programming methodology where developers write an environment model that the
program runtime automatically uses to perform state estimation. A belief
program dynamically updates and queries a belief state that captures the
possible states the environment could be in. To enable verification, we present
Epistemic Hoare Logic that reasons about the possible belief states of a belief
program the same way that classical Hoare logic reasons about the possible
states of a program. We develop these concepts by defining a semantics and a
program logic for a simple core language called BLIMP. In a case study, we show
how belief programming could be used to write and verify a controller for the
Mars Polar Lander in BLIMP. We present an implementation of BLIMP called CBLIMP
and evaluate it to determine the feasibility of belief programming.
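A minimal sketch of the core idea (plain Python with hypothetical names, not BLIMP or CBLIMP): the belief state is the set of environment states consistent with the observations so far, and each update applies the environment model's transition relation and then filters by the newly received observation.

```python
def step_belief(belief, transition, observe, obs):
    """One belief update: predict with the environment model, then filter."""
    predicted = {s2 for s in belief for s2 in transition(s)}
    return {s for s in predicted if observe(s) == obs}

# Hypothetical environment: an altitude that decreases by 1 or 2 per step,
# observed only through a coarse sensor that reports altitude // 10.
transition = lambda alt: {alt - 1, alt - 2}
observe = lambda alt: alt // 10

belief = set(range(0, 100))        # initially, any altitude from 0 to 99 is possible
for reading in [7, 7, 6]:          # sensor readings over three steps
    belief = step_belief(belief, transition, observe, reading)
print(sorted(belief))              # the hidden altitudes still consistent: [68, 69]
```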
Quantum Control Machine: The Limits of Quantum Programs as Data
Quantum algorithms for factorization, search, and simulation obtain
computational advantage by performing control flow such as branching and
iteration based on the value of quantum data in superposition. Complicating
realization of these algorithms is the fact that in predominant quantum machine
models, all control flow as embodied by the program counter is classical, and
cannot exist in superposition.
In this work, we identify that an alternative model to enable a program
counter in superposition faces an obstacle -- no such machine can correctly
support control flow constructs with non-injective semantics, including the
conventional conditional jump. In fact, prior attempts to support this
instruction cause programs to inappropriately collapse the superposition of
data, meaning that quantum advantage is lost.
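The non-injectivity is easy to see concretely (an illustrative classical sketch, not the paper's machine model): under the conventional conditional jump, two distinct machine states can step to the same successor state, so the transition has no inverse and cannot be realized as a unitary operator.

```python
# Machine state: (pc, flag). The instruction "jump to address 4 if flag" has the
# conventional semantics below.
def conditional_jump(state, target=4):
    pc, flag = state
    return (target if flag else pc + 1, flag)

# Two distinct states collide on the same successor: the step is not injective,
# so no unitary (hence invertible) operator can implement it on a superposition.
print(conditional_jump((0, True)))   # -> (4, True)
print(conditional_jump((2, True)))   # -> (4, True)
```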
We present a quantum machine model that supports both quantum effects on data
and data-dependent control flow, using variants of conditional jump with
injective semantics. We identify the necessary condition for programs for such
a machine to preserve superposition of data, and show that expressible programs
coincide with the unitary quantum circuits.
Comment: 12 pages, 8 figures.